PROGRAMMING ENVIRONMENTS BASED ON STRUCTURED EDITORS:


THE MENTOR EXPERIENCE

Véronique DONZEAU-GOUGE, Gérard HUET, Gilles KAHN and Bernard LANG


Abstract:


We discuss in this note our experience with the MENTOR program
manipulation system, from the following points of view:

- the main design decisions we made in MENTOR;

- our experience with building and using a PASCAL programming
environment based on MENTOR;

- our vision of a complete programming environment.



1.  A  MENTOR  primer 

MENTOR is a processor designed to manipulate structured data. This data is
represented as operator-operand trees, generally called abstract syntax trees. MENTOR
is driven by the tree manipulation language MENTOL.

1.1 Abstract Syntax

Abstract syntax trees are structured as sorted algebras; for a given language, we
declare a set of sorts, and a set of operators with sorted operands. Operators may be
declared with a fixed arity, or may be associative operators with a variable number of
arguments, used to represent lists. We must also specify a parser, which, given a sort,
maps a concrete syntax string into the corresponding abstract syntax tree, and some
standard inverse mapping, the prettyprinting unparser.

For instance, in MENTOR-PASCAL, typical sorts are exp, stat, varbl, ident, const,
lexp, lstat. Every meaningful PASCAL construct corresponds to an operator. Typical
operators are if, ass, call, lstat, gtr, mult, index, with sorts as follows:

• if: exp × stat × stat → stat.

• ass: varbl × exp → stat.

• call: ident × lexp → stat.

• lstat: stat × stat × ··· × stat → lstat.

• lexp: exp × exp × ··· × exp → lexp.

• gtr: exp × exp → exp.

• mult: exp × exp → exp.

• index: ident × lexp → exp.

Also, all identifiers and constants are nullary operators, of sort ident and const
respectively. Finally, our sorts are ordered; for instance, ident ⊂ varbl ⊂ exp, const ⊂ exp
and lstat ⊂ stat. In any argument place of sort σ, all operators returning a sort σ' ⊂ σ
are authorised.


Example

The following PASCAL program:

if X>0 then P(X,A[Y,Z])
else begin
Y:=Y*2;
X:=0
end

parses into (and is the unparsing of) the following abstract syntax tree:

[Figure: abstract syntax tree of the program above]
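
The sort discipline of section 1.1 can be illustrated by a small Python sketch. It is not MENTOR's implementation: the names `Node`, `mk`, `is_subsort` and the table layout are hypothetical, and the three-operand signature given to `if` is an assumption. The sketch declares the sorts and operators listed earlier, checks every operand against the sort ordering, and builds the tree of the example program.

```python
from dataclasses import dataclass, field

# Sort ordering from the text: ident < varbl < exp, const < exp, lstat < stat.
SUPERSORTS = {"ident": "varbl", "varbl": "exp", "const": "exp", "lstat": "stat"}

def is_subsort(s, t):
    """True if sort s equals t or lies below t in the ordering."""
    while s is not None:
        if s == t:
            return True
        s = SUPERSORTS.get(s)
    return False

# Fixed-arity operators: name -> (operand sorts, result sort).
FIXED = {
    "if":    (("exp", "stat", "stat"), "stat"),   # assumed signature
    "ass":   (("varbl", "exp"), "stat"),
    "call":  (("ident", "lexp"), "stat"),
    "gtr":   (("exp", "exp"), "exp"),
    "mult":  (("exp", "exp"), "exp"),
    "index": (("ident", "lexp"), "exp"),
}
# List operators: name -> (element sort, result sort).
LISTS = {"lstat": ("stat", "lstat"), "lexp": ("exp", "lexp")}

@dataclass
class Node:
    op: str
    sons: list = field(default_factory=list)
    sort: str = ""

def ident(name):
    return Node(name, [], "ident")    # identifiers are nullary operators

def const(name):
    return Node(name, [], "const")    # so are constants

def mk(op, *sons):
    """Build a node, rejecting operands whose sort does not fit the place."""
    if op in LISTS:
        elem, result = LISTS[op]
        wanted = (elem,) * len(sons)
    else:
        wanted, result = FIXED[op]
        if len(sons) != len(wanted):
            raise TypeError(f"{op} expects {len(wanted)} operands")
    for son, sort in zip(sons, wanted):
        if not is_subsort(son.sort, sort):
            raise TypeError(f"{op}: {son.sort} where {sort} expected")
    return Node(op, list(sons), result)

# if X>0 then P(X,A[Y,Z]) else begin Y:=Y*2; X:=0 end
TREE = mk("if",
          mk("gtr", ident("X"), const("0")),
          mk("call", ident("P"),
             mk("lexp", ident("X"),
                mk("index", ident("A"), mk("lexp", ident("Y"), ident("Z"))))),
          mk("lstat",
             mk("ass", ident("Y"), mk("mult", ident("Y"), const("2"))),
             mk("ass", ident("X"), const("0"))))
```

An ill-sorted construction such as `mk("ass", const("0"), ident("X"))` raises `TypeError`, since a constant is not a varbl.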


1.2 MENTOL, a tree manipulation language

The user communicates with MENTOR through an interpreter for a specialised
tree manipulation language, MENTOL. Values in MENTOL are abstract syntax trees
(abbreviated ast from now on) and locations in these trees, abbreviated loc. MENTOL
commands are themselves asts in MENTOR-MENTOL. MENTOL variables, called
markers, may be assigned locs. A loc expression is obtained by composing a base marker
with displacement operators such as U, L, R for up, left, right, or Sn, with n an integer,
for n-th son. For instance, if marker @TOP marks the top of the above PASCAL tree, @TOP
S2 S1 marks the location of identifier P. The current marker, @K, may be abbreviated
by the empty string for convenience. The MENTOL assignment statement, of the form
loc1:loc2, is used to move around in trees, and remember places. For instance, :@TOP
S2 S1 would assign the current marker to the location of P in the tree above.
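
As a rough illustration of loc expressions, the following Python sketch models a location as a root tree plus the path of son indices leading to a node. The representation and the function names (`T`, `subtree`, `star`) are assumptions made for this sketch; only the displacement names U, L, R and S come from the text.

```python
# A tree is (operator, [sons]); a location is (root, path), where path
# lists the 1-based son indices followed from the root.
def T(op, *sons):
    return (op, list(sons))

def subtree(loc):
    node, path = loc
    for i in path:
        node = node[1][i - 1]
    return node

def S(loc, n):
    """n-th son, or None (failure) if there is no such son."""
    root, path = loc
    if not 1 <= n <= len(subtree(loc)[1]):
        return None
    return (root, path + [n])

def U(loc):
    """Father, or None at the top of the tree."""
    root, path = loc
    return (root, path[:-1]) if path else None

def R(loc):
    """Right brother, or None."""
    up = U(loc)
    return S(up, loc[1][-1] + 1) if up else None

def L(loc):
    """Left brother, or None."""
    up = U(loc)
    return S(up, loc[1][-1] - 1) if up else None

def star(move, loc):
    """Iterate a displacement until it fails; return the last loc reached."""
    while (nxt := move(loc)) is not None:
        loc = nxt
    return loc

TREE = T("if",
         T("gtr", T("X"), T("0")),
         T("call", T("P"),
           T("lexp", T("X"), T("index", T("A"), T("lexp", T("Y"), T("Z"))))),
         T("lstat",
           T("ass", T("Y"), T("mult", T("Y"), T("2"))),
           T("ass", T("X"), T("0"))))

TOP = (TREE, [])
K = S(S(TOP, 2), 1)        # plays the role of  :@TOP S2 S1
print(subtree(K)[0])       # P
```

Here `star(U, K)` climbs back to the top of the tree, in the spirit of a displacement iterated until failure.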

The command loc Pn prints on the console the result of unparsing the ast at loc,
down to a level of detail specified by integer n. For instance, @TOP P2 would print:

if # then # ...

Note that list nodes are abbreviated by ..., and other nodes by #. The command
@TOP P3 would give you some more detail:

if X>0 then P(X,A[Y,Z])
begin
#:=#

end

The standard prettyprinting effected by MENTOR puts PASCAL reserved words
in lower case, identifiers in upper case, and indicates the tree structure by indentation.
When the level of detail is unspecified, you get a standard abbreviation that in most
cases fits in one screen. For instance, @TOP P would produce the text above in full.
The reverse operation of P is &, which is an expression denoting the result of parsing a
string of characters read on the input device.

An essential feature of MENTOL is pattern-matching. A pattern, or schema, is any
ast containing special terminal nodes called metavariables. A schema matches any tree
which is an instance of the pattern, replacing metavariables by appropriate subtrees. A
given metavariable may appear only once in a given pattern. Metavariables are unparsed
as special identifiers, whose name starts with a dollar sign. Schemas may be constructed
by commands, or may be input through the parser. When the syntax tables for a
given language are loaded, MENTOR constructs a set of predefined schemas, one for
each operator. These elementary schemas consist of just the given operator, applied
to metavariables; they are accessible through a marker named by the operator. For
instance, in MENTOR-PASCAL, we would have:

?@GTR P;
$EXP1>$EXP2

The find expression loc F pat denotes the first location in the subtree marked by
loc which is an instance of the pattern pat (assuming preorder traversal). If the subtree
does not contain any such instance, the special value fail is returned. Another find
expression, with F replaced by FF, does not limit the search to the subtree marked
by loc; that is, the search is continued in preorder beyond loc (this is the search one
ordinarily does in the listing of a program, starting from a given point). When pattern-
matching is successful, the markers with the same name as the metavariables of the schema
are assigned to the corresponding locations in the object tree. For instance, with the
above example:

?@TOP F @GTR P;
X>0
?@EXP1 P;
X
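
The find expression can be approximated by the following Python sketch. It is a simplification under stated assumptions: trees are plain `(operator, sons)` pairs, metavariables are leaves whose name starts with `$`, sorts are ignored, and bindings are subtrees rather than locations. The names `match`, `preorder` and `find` are illustrative, not MENTOL primitives.

```python
def T(op, *sons):
    return (op, list(sons))

def match(pat, tree, env):
    """Match tree against pat, binding each metavariable to a subtree."""
    if pat[0].startswith("$"):
        env[pat[0]] = tree        # a metavariable occurs only once per pattern
        return True
    if pat[0] != tree[0] or len(pat[1]) != len(tree[1]):
        return False
    return all(match(p, t, env) for p, t in zip(pat[1], tree[1]))

def preorder(tree):
    yield tree
    for son in tree[1]:
        yield from preorder(son)

def find(tree, pat):
    """First subtree of tree, in preorder, that is an instance of pat."""
    for sub in preorder(tree):
        env = {}
        if match(pat, sub, env):
            return sub, env
    return None                   # plays the role of the value fail

TREE = T("if",
         T("gtr", T("X"), T("0")),
         T("call", T("P"),
           T("lexp", T("X"), T("index", T("A"), T("lexp", T("Y"), T("Z"))))),
         T("lstat",
           T("ass", T("Y"), T("mult", T("Y"), T("2"))),
           T("ass", T("X"), T("0"))))

GTR = T("gtr", T("$EXP1"), T("$EXP2"))    # elementary schema for gtr
sub, env = find(TREE, GTR)
print(sub[0], env["$EXP1"][0], env["$EXP2"][0])   # gtr X 0
```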

Let us now explain briefly the main commands and control structures in MENTOL.
When a loc expression is used as a command, it abbreviates the operation of assigning to
the base marker of loc the result of evaluating loc. For instance, in the example above,
a more typical operation would be to use the current marker as follows:

? :@TOP;F @GTR;P;
X>0

Note the sequencer semicolon. A sequence of commands is made into a command
(abbreviated com) by enclosing it between parentheses (these are not mandatory at top
level). Any command may be iterated n times by postfixing it with integer n. A star
means iterate until failure; for instance, U* brings the current marker to the top of the
tree it was pointing into. A primitive conditional statement is provided: ?com1,com2
executes com1 if the previous command succeeded, otherwise it executes com2. The
command $n exits from n levels of grouping; $>n is the same, but returns failure. There
are various other control statements such as a case statement, which we shall not discuss
here. The interested reader is referred to MENTOR's manual [2]. Let us now turn to
the commands that modify asts.

The change command, loc1 C ast2, replaces the subtree marked by loc1 with the
tree ast2. As in Algol 60, a form of coercion is provided: when the second argument of
the change command is some location expression loc2, it denotes a copy of the subtree
marked by loc2. Various list manipulation commands, such as insert (I) and delete (D),
are provided. The exchange command, loc1 X loc2, exchanges the subtrees at loc1 and
loc2 (provided they are disjoint). All these operations maintain the correctness of sorts.
Rather than giving an exhaustive list, let us give a few examples:

? :@TOP;
?S2 X S3;
?S2 S1 I S3;
?S3 C &;
[STAT] :Z:=0;        %colon is prompt for parsing; note the sort reminder
?S1 X S3;
ERROR: WRONG SYNTAX TYPE

if X>0 then
begin
Y:=Y*2;
P(X,A[Y,Z]);
X:=0
end else Z:=0

Let us finally explain an essential command, eval: E ast returns a copy of the tree
ast, in which metavariables are instantiated according to the current environment. The
eval command, together with pattern-matching, makes it easy to implement program
transformations that can be described as tree rewriting systems. For instance, assume
we want to transform the operator > in the PASCAL example above into the operator
>=. Assuming the current marker is initially positioned at @TOP, the simplest way of
doing this in MENTOL is as follows:

?F @GTR;C E @GEQ;P;
X>=0
?@GEQ P;             %This works because metavariables match
$EXP1>=$EXP2
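
The combination of find, eval and change amounts to one step of tree rewriting. A minimal Python sketch of that step follows, assuming trees are `(operator, sons)` pairs and metavariables are leaves named with a leading `$`; `instantiate` and `rewrite_first` are hypothetical names, and unlike MENTOL the sketch builds a new tree instead of modifying the old one in place.

```python
def T(op, *sons):
    return (op, list(sons))

def match(pat, tree, env):
    """Match tree against pat, binding each metavariable to a subtree."""
    if pat[0].startswith("$"):
        env[pat[0]] = tree
        return True
    if pat[0] != tree[0] or len(pat[1]) != len(tree[1]):
        return False
    return all(match(p, t, env) for p, t in zip(pat[1], tree[1]))

def instantiate(schema, env):
    """Schema with its metavariables replaced by their bindings (cf. E)."""
    if schema[0].startswith("$"):
        return env[schema[0]]
    return (schema[0], [instantiate(s, env) for s in schema[1]])

def rewrite_first(tree, lhs, rhs):
    """Replace the first instance of lhs, in preorder, by the matching
    instance of rhs. Returns (new_tree, changed)."""
    env = {}
    if match(lhs, tree, env):
        return instantiate(rhs, env), True
    sons, done = [], False
    for son in tree[1]:
        if not done:
            son, done = rewrite_first(son, lhs, rhs)
        sons.append(son)
    return (tree[0], sons), done

GTR = T("gtr", T("$EXP1"), T("$EXP2"))
GEQ = T("geq", T("$EXP1"), T("$EXP2"))

TREE = T("if", T("gtr", T("X"), T("0")),
         T("call", T("P"), T("lexp", T("X"))),
         T("ass", T("X"), T("0")))

NEW, CHANGED = rewrite_first(TREE, GTR, GEQ)
print(NEW[1][0])   # ('geq', [('X', []), ('0', [])])
```

The rewrite works because both schemas use the same metavariable names, so the bindings made while matching `GTR` are exactly those needed to instantiate `GEQ`.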

We have not explained so far how we dealt with comments, and more generally with
pragmats and assertions. We have designed a general mechanism that takes comments
into account as a particular case of various possible annotations, meaningfully related
to program constructs. The idea is to attach attributes to any node of an ast. These
attributes are themselves asts in their own language. The loc expressions are extended
so as to access the various attributes of a given node and to get back from an ast to
the ast node it annotates, if any. For instance, in MENTOR-PASCAL, two attributes
are reserved for ordinary comments: the so-called prefix and postfix comments. These
simple comments have a rather poor structure: they may be just lists of lines. We
also use comments in PASCAL abstract syntax; for instance, when we optimise some
portion of program, we keep the initial version of the construct as a comment. The
system is extensible; for instance, we may declare a new abstract syntax for assertions,
annotate various constructs with them, write in MENTOL a verification condition
generator that will compute from these assertions, etc.

If MENTOL consisted only of the features mentioned so far, the reader would
question our calling it a programming language, and would probably argue that it
is nothing more than an editor command language. What makes MENTOL a full-fledged
(although not general-purpose) programming language is the possibility of writing
MENTOL procedures. We permit three kinds of procedure parameters:

a) locs passed by value;

b) locs passed by reference;

c) coms passed by name.

For instance, a standard predefined procedure is FORALL, which takes two arguments:
a pattern and a command. For every instance of the pattern, starting from
the current marker and with a preorder tree traversal, it executes its second argument.
Various utility procedures are predefined, to generate new identifiers and to provide
coercion mechanisms, such as between identifiers, strings and comment lines. Finally,
standard system procedures are provided for file manipulation, interactive help,
debugging, etc.
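
A procedure in the style of FORALL can be sketched in Python with the command passed as a callable, which stands in here for a com passed by name. The tree representation and the names `forall` and `match` are assumptions of this sketch, and whether MENTOR continues the search inside an instance it has already processed is not stated in the text; the sketch does.

```python
def T(op, *sons):
    return (op, list(sons))

def match(pat, tree, env):
    """Match tree against pat, binding each metavariable to a subtree."""
    if pat[0].startswith("$"):
        env[pat[0]] = tree
        return True
    if pat[0] != tree[0] or len(pat[1]) != len(tree[1]):
        return False
    return all(match(p, t, env) for p, t in zip(pat[1], tree[1]))

def forall(tree, pat, command):
    """Run command(instance, bindings) on every instance of pat, in preorder.
    Returns the number of instances found."""
    count = 0
    env = {}
    if match(pat, tree, env):
        command(tree, env)
        count = 1
    for son in tree[1]:
        count += forall(son, pat, command)
    return count

PROGRAM = T("lstat",
            T("ass", T("Y"), T("mult", T("Y"), T("2"))),
            T("call", T("P"), T("lexp", T("X"))),
            T("ass", T("X"), T("0")))

# Collect the left-hand side of every assignment.
targets = []
n = forall(PROGRAM, T("ass", T("$VARBL"), T("$EXP")),
           lambda node, env: targets.append(env["$VARBL"][0]))
print(n, targets)   # 2 ['Y', 'X']
```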

This procedure encapsulation mechanism is essential to MENTOR. It allows the
designer of a programming environment to provide the user with powerful program
manipulations in terms of the logical constructs of the specific programming language
manipulated. These manipulations can be heavily context dependent, and may use
semantic knowledge of the programming constructs, as opposed to the purely structural
context-free manipulations of the MENTOL primitives. Finally, it allows extensible
systems to be built, in which users construct and maintain their own environment
of procedures.


1.3 A PASCAL Programming Environment Based on MENTOR

MENTOR is a general system to manipulate structured information. However, from
the start we intended as its main application the realisation of an interactive programming
environment in which a programmer may design, implement, document, debug,
test, validate, maintain and transport his programs. Furthermore, we intended this
environment to be realistic enough to help in implementing large software developments,
and provide a programming team with tools for specifying a design, enforcing a programming
methodology and verifying interfaces. Our intention when we started the project,
at the end of 1974, was to try and bridge the gap between, on the one hand, existing
programming tools such as debugging compilers and, on the other hand, the vast amount
of theoretical research on semantics of programming languages. At the same time, we
did not want to commit ourselves to any currently proposed programming methodology
(top-down design, structured programming, etc.) or formalism (first-order assertions,
Hoare rules, modal logic), for which a wide consensus did not exist. Rather, we wanted
our system to be general enough to accommodate these various formalisms and provide
tools to implement the proposed methodologies. We chose to implement a PASCAL
environment around the MENTOR system for several reasons. Most importantly, we had
chosen PASCAL as our system implementation language, and we wanted to implement
first the tools we needed ourselves in our development effort. We bootstrapped as soon
as the core of the system was implemented, and this may be one of the most important
practical decisions that forced us to focus on pragmatic issues.

The first step in this effort was the design of a structured editor for PASCAL
programs, implemented in MENTOR-PASCAL. That is, we wrote a number of MENTOL
procedures that are the main user commands to construct and modify PASCAL
programs and their documentation. Some of these procedures are used to move in the ast
of the program according to higher-level concepts. For instance, FPROC is used to
move to the top of a procedure; it first asks the user the name of this procedure. VAR
goes to the immediately surrounding variable declaration part, etc. Other MENTOL
procedures effect simple program transformations. For instance, LABEL is used to label
a statement. It requests the label name from the user, verifies that this label is neither
declared nor used in the current environment, declares it, and finally labels the statement
pointed to by the current marker. All these manipulations are transparent to the user,
as long as no error condition occurs.

We then turned our effort to implementing tools for the normalization and documentation
of PASCAL programs. Normalizing programs consists of arranging them in a
standard, more readable form, while preserving semantic equivalence. For instance,
declarations may be rearranged so that logically related items are declared in the same
area. Various cleaning-up operations are performed, to get rid of unnecessary structures
(empty statements, compound statements, etc.). This is especially important when a
series of program transformations have been applied mechanically, since often they are
easier to program with redundant structure. Of course none of these simplifications
should get rid of comments. Automatic documentation consists of generating comments
automatically at various standard places in the program, generating scope structures,
cross-reference tables, etc. Some of these may involve complicated computations on the
program. The basic philosophy is that all generated documentation is itself structured,
so that it can be used by further processes. We do not elaborate further on normalization
and documentation of programs in MENTOR, and refer the interested reader to [6].
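
One of the cleaning-up operations mentioned above, removing empty and redundant compound statements, might look as follows in a Python sketch. The operator name `empty` for the empty statement is hypothetical, comments are not modelled, and the sketch assumes that every place holding an lstat also accepts a single statement.

```python
def T(op, *sons):
    return (op, list(sons))

def normalise(node):
    """Drop empty statements, flatten nested statement lists, and replace
    a one-element statement list by its only element."""
    op, sons = node
    sons = [normalise(s) for s in sons]
    if op == "lstat":
        flat = []
        for s in sons:
            if s[0] == "empty":
                continue               # empty statement: drop it
            if s[0] == "lstat":
                flat.extend(s[1])      # begin ... end nested in a list
            else:
                flat.append(s)
        if len(flat) == 1:
            return flat[0]
        sons = flat
    return (op, sons)

MESSY = T("lstat",
          T("empty"),
          T("lstat", T("ass", T("X"), T("0")), T("empty")),
          T("ass", T("Y"), T("1")))
print(normalise(MESSY) ==
      T("lstat", T("ass", T("X"), T("0")), T("ass", T("Y"), T("1"))))   # True
```

Working on the abstract syntax means the sketch never has to ask whether removing a begin ... end pair changes how the text parses; that question is left to the unparser.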

Another area we started to investigate was an approach to debugging by source-
level program manipulation. The idea is that, instead of giving you run-time debugging
tools that have a more or less satisfactory user interface, we shall provide you with
special versions of your source program, with user interfaces built in. You can compile
and run these special versions using your standard production compiler. For instance, a
procedure PROFILE allows you to compute an execution profile of your program as a
side-effect of your main computation. We think this area is worthy of more research.

The effort of designing and implementing a bona fide programming environment
based on MENTOR-PASCAL is still going on. Rather than listing in painstaking detail
all that is available to the user in the current state of the system, let us discuss what
is our idea of a satisfactory environment, and what problems we are encountering in its
implementation.

The main philosophy of our programming environment is to build specialised
interpreters that help the programmer by doing various computations and rearrangements
on his programs. All these interpreters communicate, between themselves as well as with
the user, through the abstract syntax of PASCAL and its annotations. The development
of a program is conceived as a multi-pass activity, each processor using as assumptions
the normalisation and computations effected by the previous passes. For instance, the
"correctness" of a piece of program may be progressively checked/debugged according to
the following scenario:

• As soon as the program is input, it is correct as far as its context-free structure
is concerned, and this will be enforced by MENTOR's typing mechanism during any
further transformation.

•  Then  a  “scoper”  processes  the  program,  checking  for  existence  of  declarations  for 
the  various  identifiers  used  in  the  program.  This  pass  may  be  described  as  “computing 
the  lambda-calculus  skeleton”  of  the  program. 

• When all names are linked to their proper declaration, it is easy to write a
type-checker that will check for the correct typing of all the programming language
operations. This step is conceptually, and indeed in our scenario is implemented as, a
non-standard interpretation of the programming language constructs. A complete set
of MENTOL procedures for PASCAL scope and type checking has been developed, and
used to develop type-preserving manipulations in MENTOR-PASCAL [7].

• At this stage it is natural to check for run-time errors, termination, aliases. Here
we need much more semantic information. Most of the checks mentioned are undecidable
in general, but easy sufficient conditions are reasonable to implement. These checks can
be realised by the combination of specialised data-flow analysis routines and a general
symbolic interpreter. A set of MENTOL procedures that check aliasing in PASCAL and
its application to proving sufficient conditions for a procedure to be free of side-effects
is described in [9].

• The hardest part of program verification remains: checking that the program
actually corresponds to what the programmer expected. The traditional approach would
be to implement a debugging interpreter, which would execute directly from the abstract
syntax and various other structures (symbol tables) constructed by the above processes.
A more formal approach would require the user to state formal specifications, such
as first-order assertions, and would check the adequacy of the program with respect to its
specifications. For instance, verification conditions may be generated through symbolic
execution, and then input to a theorem prover. The formulas generated, as well as
the proof trees, would of course be in turn asts manipulable by MENTOR; the user
could therefore monitor the proof with the same tools he is using for manipulating his
programs. This semi-automated approach would alleviate the difficulty of having to
implement a completely automatic theorem prover, a task which is still beyond the state
of the art. Another rigorous approach would be to process in MENTOR a complete
description of the semantics of the programming language, using for instance semantic
equations, and use it to translate a program into its denotation. This allows us to get
away from the idiosyncrasies of the particular programming language used and limit the
proofs to identities between mathematically well-understood concepts. Such a grandiose
meta-system could be conceived as essentially combining the capabilities of the SIS [?]
and LCF [5] systems within MENTOR.

A similar scenario can easily be designed for program optimisation: local optimisations
are performed by program transformations, then more global optimisations are
effected at the source level after doing the necessary computations by MENTOL
procedures. The program is then compiled into an object code which has its own abstract
syntax. Final optimisations are performed by transformations on the object code.

The general strategy behind a programming environment as sketched above is to
effect successive refinements on the original program, by going from the simpler, better
understood tasks, to the more sophisticated and costly verifications. However, only a
small fraction of the above ambitious plan has been actually implemented in MENTOR-
PASCAL. There are mostly two reasons for this, which are actually complementary
aspects of the same phenomenon:

1) Even the easiest and most natural program transformations are hard to implement
in a totally safe way in the current state of baroqueness of programming languages.
For instance, it is impossible in PASCAL to separate scope-checking from type-checking
because of the with construct. The lack of orthogonality of the language makes it a
complex and costly process to do any but the most trivial program transformations. For
instance, replacing tail recursions by gotos in recursive procedures with call-by-value
arguments represents about 200 lines of MENTOL procedures. Again, the assumption
must be made (and checked) that no with statement occurs.

2) The more mundane transformations have proved to be challenging and interesting
research problems. Their careful implementation is often crucial, since many
computations involved turn out to be very time consuming. An especially interesting area of
applications is the transport of programs. Our largest application so far was to transport
MENTOR from its original IRIS 80 implementation to its PDP 10 version. This is
performed in a completely mechanical manner by a set of MENTOL procedures. This way
any new release of MENTOR can be followed (after a few hours of computation) by the
release of a totally compatible PDP 10 version.

The conclusion we draw from this state of affairs is that no satisfactory
programming environment will exist for ugly languages. On the other hand, it is clear
that purely applicative languages are not about to be widely accepted; in the real world
of programming, complex data structures with sharing and complex control structures
and parameter passing mechanisms are the rule rather than the exception. We are not
arguing in favor of simplistic toy programming languages, but the point is rather that
the study of program transformations provides interesting guidelines on the design of
future programming languages. As might be expected, these design criteria are closely
related to those based on semantic considerations [10]. We have good hope that the state
of affairs will improve with the advent of new programming languages whose design will
have benefited from programming language semantics research and experience gained
with systems such as MENTOR. A positive step in this direction has been taken with
the ADA language development, since the design of the language included a formal
semantics definition. It is interesting to note that this formal semantics is based on a
MENTOR-compatible abstract syntax definition.


2.  The  Main  Design  Decisions  in  MENTOR 


2.1  The  abstract  syntax 

The  notion  of  abstract  syntax  is  familiar  to  any  compiler  writer.  It  is  a  tree-like 
representation  of  the  structure  of  programs.  Operators  of  the  abstract  syntax  are  the 
basic  building  blocks  of  the  language.  We  want  to  strongly  emphasise  that  abstract 
syntax  is  NOT  parse  trees.  It  is  indeed  very  different  conceptually,  although  our  trees 
can  be  obtained  by  collapsing  and  normalising  parse  trees.  Here  are  a  few  important 
differences: 

1)  Lists  are  represented  as  one  list  node,  not  as  binary  trees 

2)  The  reserved  words  of  the  language  occur  as  node  labels,  not  as  leaves 

3) Non-terminals of the grammar do not generate nodes. Some correspond to
sorts, others do not appear at all. For instance, an identifier may occur directly as
an expression, the intermediate levels of parsing such as simpleexp, factor, term being
collapsed.

4)  Parentheses  are  NOT  part  of  the  structure,  they  are  generated  optionally  by  the 
unparser  if  the  context  requires  it,  because  of  precedence  reasons  for  instance. 

Point 3 is particularly important: every node of the abstract syntax leaves a concretely
visible mark in the print-out, and this is a big help for the user going up and
down the tree. This makes MENTOR significantly different from previous structured
editors such as Hansen's, where the user moved around in his program with the help of
grammar menus.

Point 4 is important too. For the MENTOR user an exp is an exp. Precedence
relations are left for the unparser to worry about. For instance, with the example above:

?@TOP F @MULT;S1 C &;
[EXP]


Similarly, the problem of PASCAL's dangling else completely vanishes:

?@TOP S2 C &;                  %we change the then part of an if
[STAT] :IF Y<X THEN X:=Y;      %into a conditional statement
?@TOP P;
if X>0 then
if Y<X then X:=Y else          %note the extra else generated
else Z:=0

MENTOR trees are not LISP trees either. Even for the LISP language, the
coding of programs as binary trees with atom leaves is rather remote from the abstract
structure of the program. Also points 1 and 2 above apply. Our structure is much
richer; for instance, in MENTOR-PASCAL, we have about 100 operators,
whereas LISP structures have only one (cons). For these reasons, we consider MENTOR
significantly different from, say, the INTERLISP editor.

So much for the choice of the general formalism of abstract syntax. Of course
for each particular language there is a certain degree of freedom in the design of the
particular operators and sorts. As we mentioned above, it is crucial that almost every
operator add some concrete representation to the unparsing of a piece of program. An
important, but not mandatory, requirement is that the unparsing of an operator should
not depend too much on the context in which it occurs. This requirement is met by most
operators in MENTOR-PASCAL, except that certain nodes are sometimes surrounded
by parentheses according to the context. There are mostly two occurrences of this
phenomenon:

a) parentheses surrounding list nodes may change with the context. For instance,
an lstat is usually unparsed as a compound begin ... end, except when appearing as
the loop of a repeat.

b)  parentheses  may  be  needed  for  precedence,  or  dangling  structures  such  as  shown 
above  for  the  else.  Our  unparser  always  generates  the  minimum  number  of  parentheses 
needed  for  a  correct  parsing.  This  is  the  only  normalisation  (besides  indentation  of 
course)  that  is  completely  automatic  and  over  which  the  user  has  no  control. 
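
The way an unparser can regenerate only the parentheses that are needed may be sketched in Python as follows. The sketch covers three binary operators; `plus` is a hypothetical operator name added for the illustration, and the precedence levels follow PASCAL's usual ranking (multiplying operators above adding operators above relational operators, the latter not chainable). It is not a description of MENTOR's actual unparser.

```python
def T(op, *sons):
    return (op, list(sons))

# operator -> (symbol, precedence, may chain on the left without parentheses)
BINARY = {
    "mult": ("*", 3, True),
    "plus": ("+", 2, True),
    "gtr":  (">", 1, False),
}

def unparse(node, min_prec=0):
    """Unparse an expression; min_prec is the lowest precedence the
    context accepts without parentheses."""
    op, sons = node
    if op not in BINARY:
        return op                                  # identifier or constant
    sym, prec, chains = BINARY[op]
    left = unparse(sons[0], prec if chains else prec + 1)
    right = unparse(sons[1], prec + 1)             # keep the tree shape on the right
    text = f"{left}{sym}{right}"
    return f"({text})" if prec < min_prec else text

print(unparse(T("mult", T("plus", T("Y"), T("1")), T("2"))))   # (Y+1)*2
print(unparse(T("plus", T("Y"), T("mult", T("Y"), T("2")))))   # Y+Y*2
```

Since parentheses are never stored in the tree, replacing the first operand of a product by a sum needs no special care: the parentheses appear when the tree is printed.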

When designing an abstract syntax for a specific language, the following trade-off
occurs. Various constructs of the language may be represented by the same concrete
strings. Now there is a choice as to whether you want to separate these two constructs
as two distinct operators, or if you want to merge them into one. The maximum
discrimination has the advantage that your structure will have a finer grain; for instance,
you will catch by the find command instances of one construct independently from
instances of the other. On the other hand, certain program transformations will be
harder, and the user has more constructs to learn. For instance, should parameter
declarations use the same construct as variable declarations? As might be expected,
referential transparency and orthogonality are important properties for a programming
language to possess for a completely satisfactory design of its abstract syntax.

We feel that allowing arbitrary annotations of nodes by abstract syntax trees in
specialised languages was an important design decision. This makes our system open
ended to various developments, without interfering with the tools already designed: a
given interpreter may have access to certain annotations, the others being invisible. For
instance, certain annotations are comments for the user to see. Others may be pragmats
for the compiler, specifications in some formal language for use by a verifier, data
structures for control flow analysis, original code commenting some optimised sections,
example runs, assertions for run-time checks, etc. It is important that these various
structures do not interfere with one another and with the program itself.
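One way to picture this separation is sketched below. The sketch is ours and is
not MENTOR's data structure: the class, the field names and the two functions are
hypothetical. Each node carries annotations grouped by the name of the annotation
language, and a tool sees only the languages it declares.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    operator: str
    children: list = field(default_factory=list)
    # Annotations grouped by annotation language (assumed representation).
    annotations: dict = field(default_factory=dict)


def annotate(node: Node, language: str, tree) -> None:
    """Attach an annotation tree written in the given language."""
    node.annotations.setdefault(language, []).append(tree)


def visible(node: Node, languages: set) -> dict:
    """Return only the annotations a tool has declared an interest in."""
    return {k: v for k, v in node.annotations.items() if k in languages}


stat = Node("assign")
annotate(stat, "comment", Node("text"))
annotate(stat, "assertion", Node("greater"))
# A hypothetical run-time checker sees assertions and nothing else.
checker_view = visible(stat, {"assertion"})
```

Here the comment attached to the statement remains in the tree but is invisible
to the checker, and adding a new annotation language requires no change to tools
that do not ask for it.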

It may be appropriate to discuss here why we decided to stick to trees, and not go
to more complicated graph structures, such as (shared) dags or control flow graphs. The
main reason is that we know how to maintain the integrity of these context-free structures
incrementally. For instance, we could imagine keeping programs correct
according to the full PASCAL syntax, including type checking. But,
aside from the fact that it would be a lot more costly to maintain all the information
needed for checking this correctness while the program is being edited, this would have
the additional (and in our opinion insuperable) drawback that it would preclude
developing programs in any but the most awkward fashion.

2.2  MENTOL 

MENTOL is our tree-manipulation language. The above description of its main
commands gives a flavor of MENTOL programming. The salient features of the language
are:

a) it is an interactive language, used for editing; but it may also be used to program
lengthy batch computations.

b)  it  is  not  applicative;  MENTOL  constructs  divide  into  expressions,  that  are  simply 
evaluated  for  their  result,  and  commands,  which  have  various  side  effects. 

c)  it  is  a  specialized  language,  for  manipulating  trees;  it  has  no  pretense  of  being 
general-purpose,  although  it  has  rudimentary  arithmetic  capabilities. 

d) it has reasonably good user interaction facilities: there are various debugging
aids such as a trace package and an interrupt facility, and the user may execute in coroutine
with programs, a very handy feature for “controlled” program manipulation.

e) MENTOL has its own abstract syntax. It is therefore possible to edit and develop
MENTOL programs under MENTOR. Actually a standard facility exists to go back and
forth between a PASCAL editing session and a MENTOL editing environment, in which
the (advanced) user may modify his PASCAL manipulation programs.

f) File manipulation primitives are provided. Several file formats are known to
MENTOR: standard text files, which may be input (through parsing) and output (through
unparsing); tree files, which keep asts from one session to the next without the need to
reparse; and MENTOL files, containing MENTOL procedures, a special case of which
is the initialisation file, which loads a specific user's editing prelude.

Pattern matching deserves a little discussion. As we have already stressed, pattern
matching is a fundamental operation in MENTOL. The user may construct any tree
pattern, i.e. any ast with metavariables occurring anywhere. However, each metavariable
may occur only once in a pattern. This condition is required because of the side-effects to
the corresponding markers. Note that this is not really a restriction, since a primitive is
provided for testing equality of trees. No list metavariables are provided at the moment,
because associative pattern matching is a complicated operation (a tree may match a
pattern in more than one way if such list variables are allowed), and because it was never
strongly felt as desirable, except probably for orthogonality. But we want to stress the
considerable pattern matching capability we have in MENTOR, as opposed, say, to string
searching in a more conventional editor. Anybody who tries to trace uses of identifier I
in his program (as opposed to all occurrences of character I, in other identifiers, reserved
words, strings and comments!) will understand this point. Furthermore the MENTOL
pattern matching is fast, because types are used to focus the search on the relevant part
of the trees. For instance, MENTOR knows that in PASCAL statements are disjoint
from declarations, and may not occur in expressions. It will therefore focus the search
for a statement on a narrow region in the program tree (and of course comments will not
get in the way either). We may therefore argue that tree pattern matching is actually
faster than string pattern matching. We believe this is one of the main arguments for
having typed rather than untyped structures.
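The two ideas above, metavariables that occur only once and a search pruned by
sort information, can be sketched as follows. This is an illustration under our
own assumptions, not MENTOL: the classes, the function names and the set of
operators to skip are hypothetical, and leaves are labelled directly by their
name for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Tree:
    op: str
    kids: list = field(default_factory=list)


@dataclass
class Meta:
    name: str  # a metavariable standing for any subtree


def match(pattern, tree):
    """Return a dict of metavariable bindings, or None if no match."""
    bindings = {}

    def go(p, t):
        if isinstance(p, Meta):
            if p.name in bindings:
                raise ValueError("metavariable occurs twice: " + p.name)
            bindings[p.name] = t
            return True
        return (p.op == t.op and len(p.kids) == len(t.kids)
                and all(go(a, b) for a, b in zip(p.kids, t.kids)))

    return bindings if go(pattern, tree) else None


def find(pattern, tree, skip=frozenset()):
    """Yield bindings for each matching subtree, never entering a subtree
    whose operator is in skip (sorts that cannot contain a match)."""
    if tree.op in skip:
        return
    found = match(pattern, tree)
    if found is not None:
        yield found
    for kid in tree.kids:
        yield from find(pattern, kid, skip)


program = Tree("program", [
    Tree("declarations", [Tree("var", [Tree("x")])]),
    Tree("lstat", [
        Tree("assign", [Tree("x"), Tree("plus", [Tree("x"), Tree("1")])]),
        Tree("assign", [Tree("y"), Tree("0")]),
    ]),
])
pattern = Tree("assign", [Meta("LHS"), Meta("RHS")])
# Statements cannot occur under declarations or inside expressions.
results = list(find(pattern, program, skip={"declarations", "plus"}))
```

The search visits neither the declaration part nor the interior of the
expression, which is the pruning argued for above; a string search would have to
scan both.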

It  is  clear  that  MENTOL  is  not  the  last  word  in  tree-manipulation  languages. 
However,  we  wanted  to  acquire  a  reasonable  amount  of  experience  with  writing  program 
transformations  in  MENTOL  before  drawing  definite  conclusions  about  such  languages. 
All in all, MENTOL has served its purpose well: it is easy to learn, it is fairly easy to
debug, and it is fast enough for editing. However, long MENTOL procedures are hard to
read, and a compiler is clearly needed for complicated transformations done in batch
mode.


2.3  A  special  word  for  screen  editor  fans. 

One  of  the  most  commonly  heard  criticisms  of  MENTOR  is  that  it  should  be 
possible to edit programs on your screen in the same way as for instance with the EMACS
editor.  We  do  not  believe  that  this  would  be  an  easy  task,  and  we  do  not  even  think 
that  such  a  facility  is  really  desirable. 

The  first  point  concerns  portability.  In  the  initial  MENTOR  design,  we  had  planned 
to  define  a  screen  as  partitioned  between  several  areas,  and  to  have  the  text  under  the 
current  marker  represented  specially  on  the  screen.  We  went  as  far  as  implementing 
these  displays,  but  then  changed  our  minds,  mostly  because  it  was  very  hard  to  distribute 
our  system.  We  reverted  to  teletype-compatible  output.  MENTOR  can  be  transported 
to  any  machine  with  any  interactive  operating  system  (modulo  the  PASCAL  transport 
problems,  of  course).  No  special  terminals  are  needed,  but  of  course  the  system  will  be 
more  pleasant  to  use  if  the  rate  of  transmission  is  higher,  so  that  it  is  not  too  costly  to 
have  the  current  marker  expression  printed  often. 

The second point concerns the difficulty of maintaining two separate representations.
Remember, the text printed on your screen is not stored anywhere; it is just computed on
demand by the prettyprinter. The ability to manipulate screen images would force us to
keep the printed text internally, and to try and link it to the corresponding ast. After
some modification is effected on the screen, the parser would have to be called into action
to validate the changes before updating the tree accordingly. The difficulties may not
be insurmountable, but it is not clear that the end result would be worth the effort.

Finally, a major drawback of mixing structured editing and display editing is that
the user would have to learn how to use two command languages instead of one. We
believe that most users would stick to either mode, but would not like mixing them.

Conclusion 

MENTOR has been used for most of its own development and maintenance. Various
groups at INRIA use MENTOR as their main programming tool for developing PASCAL
programs. MENTOR has been distributed to various research and teaching institutions.
In particular, it is being used at Université de Toulouse for teaching programming in
PASCAL. Using MENTOR requires some training. It seems that on average a
PASCAL programmer needs about a week to get accustomed to this new world of trees.
Past this period, few return to the standard tools.

It is our thesis that using an abstract manipulation system as the core of a
programming environment is a good paradigm. However, it is very important that the user may
communicate with the system through the concrete syntax he is used to: he should be able to
visualise his trees through unparsing, and conversely input his programs through parsing.
The abstract syntax manipulation language should have a powerful procedure
abstraction mechanism, making it possible to extend the system at will with complicated semantic
checking, such as data flow analysis and ultimately formal proofs. It is important to be
able to manipulate structured annotations, linked to the structure of the program, but
conceived as separate entities and not as extensions to the user’s programming language
syntax. We envision a satisfactory programming environment as unifying under a
common set of tools the whole range of a programming team's activity: design, development,
documentation, debugging, maintenance and transport. However, the long-range goal of
software reliability will be attainable only when new programming languages, designed
along sound semantic principles, are available.